Abstract

Multimedia learning is a cognitive theory of learning which has 
been popularized by the work of Richard E. Mayer and others. 
Multimedia learning happens when we build mental 
representations from words and pictures. The theory has largely 
been defined by Mayer’s cognitive theory of multimedia learning. 
Generally, the theory tries to address the issue of how to structure 
multimedia instructional practices and employ more effective 
cognitive strategies to help people learn efficiently. Baddeley’s
model of working memory, Paivio’s dual coding theory, and 
Sweller’s theory of cognitive load are integral theories that support 
the overall theory of multimedia learning. The theory can be 
summarized as having the following components: (a) a
dual-channel structure of visual and auditory channels, (b) limited
processing capacity in memory, (c) three memory stores (sensory,
working, long-term), (d) five cognitive processes of selecting,
organizing, and integrating (selecting words, selecting images,
organizing words, organizing images, and integrating new
knowledge with prior knowledge), and (e) theory-grounded and
evidence-based multimedia instructional methods. Important
considerations for implementing the theory are discussed, as well 
as current trends and future directions in research. 

Introduction 

The cognitive theory of multimedia learning was popularized by 
the work of Richard E. Mayer and other cognitive researchers who 
argue that multimedia supports the way that the human brain 
learns. They assert that people learn more deeply from words and 
pictures than from words alone, which is referred to as the 
multimedia principle (Mayer 2005a). Multimedia researchers 
generally define multimedia as the combination of text and 
pictures and suggest that multimedia learning occurs when we
build mental representations from these words and pictures 
(Mayer, 2005b). The words can be spoken or written, and the 
pictures can be any form of graphical imagery including 
illustrations, photos, animation, or video. Multimedia instructional 
design attempts to use cognitive research to combine words and 
pictures in ways that maximize learning effectiveness. 



Cognitive Theory of Multimedia Learning 


The theoretical foundation for the cognitive theory of multimedia 
learning (CTML) draws from several cognitive theories including
Baddeley’s model of working memory, Paivio’s dual coding
theory, and Sweller’s theory of cognitive load. As a cognitive
theory of learning, it falls under the larger framework of cognitive
science and the information-processing model of cognition. The
information-processing model suggests several information stores
(memory) that are governed by processes that convert stimuli to
information (Moore, Burton, & Myers, 2004). Cognitive science
studies the nature of the brain and how it learns by drawing from 
research in a number of areas including psychology, neuroscience, 
artificial intelligence, computer science, linguistics, philosophy, 
and biology. The term cognitive refers to perceiving and knowing. 
Cognitive scientists seek to understand mental processes such as 
perceiving, thinking, remembering, understanding language, and 
learning (Stillings, Weisler, Chase, Feinstein, Garfield, & Rissland, 
1995). As such, cognitive science can provide powerful insight 
into human nature, and, more importantly, the potential of humans 
to develop more efficient methods using instructional technology 
(Sorden, 2005). 


Key Elements of the Theory 

The cognitive theory of multimedia learning (CTML) centers on 
the idea that learners attempt to build meaningful connections 
between words and pictures and that they learn more deeply than
they could have with words or pictures alone (Mayer, 2009).
According to CTML, one of the principal aims of multimedia
instruction is to encourage the learner to build a coherent mental 
representation from the presented material. The learner’s job is to 
make sense of the presented material as an active participant, 
ultimately constructing new knowledge. 

According to Mayer and Moreno (1998) and Mayer (2003), CTML 
is based on three assumptions: the dual-channel assumption, the 
limited capacity assumption, and the active processing assumption. 
The dual-channel assumption is that working memory has auditory 
and visual channels based on Baddeley’s (1986) theory of working 
memory and Paivio’s (1986; Clark and Paivio, 1991) dual coding 
theory. Second, the limited capacity assumption is based on 
cognitive load theory (Sweller, 1988, 1994) and states that each
subsystem of working memory has a limited capacity. The third 
assumption is the active processing assumption which suggests that 
people construct knowledge in meaningful ways when they pay 
attention to the relevant material, organize it into a coherent mental 
structure, and integrate it with their prior knowledge (Mayer, 1996,
1999). 

The Three Store Structure of Memory in CTML 

CTML accepts a model that includes three memory stores known 
as sensory memory, working memory, and long-term memory. 
Sweller (2005) defines sensory memory as the cognitive structure 
that permits us to perceive new information, working memory as
the cognitive structure in which we consciously process 
information, and long-term memory as the cognitive structure that 
stores our knowledge base. We are only conscious of information 
in long-term memory when it has been transferred to working 
memory. Mayer (2005a) states that sensory memory has a visual
sensory memory that briefly holds pictures and printed text as
visual images, and an auditory sensory memory that briefly holds spoken
words and sounds as auditory images. Schnotz (2005) refers to
sensory memory as sensory registers or sensory channels and
points out that although we tend to view the dual-channel sensors as
eye-to-visual working memory and ear-to-auditory working
memory, it is possible for other sensory channels to introduce
information to working memory, such as “reading” with the fingers
through Braille or a deaf person being able to “hear” by reading
lips.

Working memory attends to, or selects, information from sensory
memory for processing and integration. Sensory memory holds an
exact sensory copy of what was presented for less than 0.25 of a
second, while working memory holds a processed version of what 
was presented for generally less than thirty seconds and can 
process only a few pieces of material at any one time (Mayer 
2010a). Long-term memory holds the entire store of a person’s 
knowledge for an indefinite amount of time. Figure 1 is a 
representation of how memory works according to Mayer’s 
cognitive theory of multimedia learning. 


Figure 1. Mayer’s Cognitive Theory of Multimedia Learning
(Mayer, 2010a)


Mayer (2005a) states that there are also five forms of
representation of words and pictures that occur as information is
processed by memory. Each form represents a particular stage of
processing in the three memory stores model of multimedia
learning. The first form of representation is the words and pictures
in the multimedia presentation itself. The second form is the
acoustic representation (sounds) and iconic representation (images)
in sensory memory. The third form is the sounds and images in
working memory. The fourth form of representation is the verbal
and pictorial models, which are also found in working memory.
The fifth form is prior knowledge, or schemas, which are stored in
long-term memory.

According to CTML, content knowledge is contained in schemas 
which are cognitive constructs that organize information for 
storage in long-term memory. Schemas organize simpler elements
that can then act as elements in higher order schemas. As learning 
occurs, increasingly sophisticated schemas are developed and 
learned procedures are transferred from controlled to automatic 
processing. Automation frees capacity in working memory for 
other functions. This process of developing increasingly 
complicated schemas that build on each other is also similar to the 
explanation given by Chi, Glaser, and Rees (1982) for the 
transition from novice to expert in a domain. 

The Development of the Theory of Working Memory 

The current conception of working memory in CTML grew out of 
Atkinson & Shiffrin’s (1968) model of short-term memory. The
Atkinson & Shiffrin model was viewed primarily as a structure for
temporarily storing information before it passed to long-term
memory. Eventually, researchers began to question some of the 
assumptions of short-term memory and a few started to look for 
better explanations. Baddeley and Hitch (1974) subsequently 
proposed a more complex model of short-term memory which they 
called working memory. Their model for working memory was a 
system with subcomponents that not only held temporary 
information, but processed it so that several pieces of verbal or 
visual information could be stored and integrated. 

Baddeley (1986, 1999) later proposed that there was an additional 
component in working memory called the central executive. 
According to the theory, the central executive controlled the two 
subcomponents of working memory, known as the visuo-spatial 
sketch pad and the phonological loop. The central executive also 
was responsible for controlling the overall system and engaging in 
problem solving tasks and focusing attention. Baddeley theorized 
that the central executive could transfer storage tasks to the two 
subcomponent systems in working memory, so that the central 
executive would continue to have capacity for performing more
demanding selection and information processing tasks. 

The visuo-spatial sketch pad is assumed to maintain and 
manipulate visual images. The phonological loop stores and 
rehearses verbal information. It has also been suggested that the
phonological loop has an important function of facilitating the
acquisition of language by maintaining a new word in working
memory until it can be learned (Baddeley, Gathercole, & Papagno,
1998). Baddeley (2002) eventually proposed the addition of a third 
subsystem known as the episodic buffer, which has acquired some 
of the tasks that were originally attributed to the central executive 
(now seen as a purely attentional system). The episodic buffer 
functions as a storage structure which acts as a limited capacity 
interface to integrate multiple sources of information from other
slave systems. 

Sweller (2005) and Yuan, Steedle, Shavelson, Alonzo & Oppezo 
(2006) suggest that while there is strong evidence for the two main 
subcomponents in working memory, there is less evidence for
a central executive that consciously attends to information in
sensory memory. Rather, Sweller suggests that schemas which
exist in long-term memory serve as the executive function,
ultimately directing working memory to attend to information that
fits pre-existing schemas. Schemas determine which information
enters working memory because we tend to pay attention to
information that fits the knowledge that we already have. This
would support the idea that our paradigms cause us to focus on
information that fits our existing beliefs, while ignoring
information that does not fit neatly into our understanding of the
world. 

Meaningful Learning 

Mayer (2010a) argues that meaningful learning from words and 
pictures happens when the learner engages in five cognitive 
processes: 

1. selecting relevant words for processing in verbal working 
memory 

2. selecting relevant images for processing in visual working 
memory 

3. organizing selected words into a verbal model 

4. organizing selected images into a pictorial model 

5. integrating the verbal and pictorial representations with 
each other and with prior knowledge. 

These cognitive processes in working memory determine which
information is attended to or selected, which knowledge is
retrieved from long-term memory and integrated with the new
information to construct new knowledge, and ultimately, which
bits of new knowledge are transferred to long-term memory.
Knowledge that is constructed in working memory is transferred to
long-term memory through the process of encoding (Mayer, 
2008b). However, Dwyer & Dwyer (2006) caution that proper 
encoding requires rehearsal and since rehearsal takes time, the 
multimedia lesson must allow an adequate period for incubation or 
it can be ineffective. Hasler, Kersten, & Sweller (2007) add that 
this is why learner control is important when using animation in 
multimedia learning. 

Mayer (2009) distinguishes meaningful learning from “no 
learning” and “rote learning” and describes it as active learning 
where the learner constructs knowledge. Meaningful learning is 
demonstrated when the learner can apply what is presented in new 
situations, and students perform better on problem-solving transfer
tests when they learn with words and pictures. Mayer (2008b) also
identifies two types of transfer: transfer of learning and
problem-solving transfer. Transfer of learning occurs when previous
learning affects new learning. Problem-solving transfer occurs
when previous learning affects the ability to solve new problems.
Mayer defines learning as a “change in knowledge attributable to
experience” (2009, p. 59). Learning is personal and cannot be
directly observed because it happens within the learner’s cognitive
system. It must be inferred through a change in behavior such as
performance on a task or test.

Cognitive Load 

The limited capacity assumption states that there is a limit to the 
amount of information that can be processed at one time by
working memory. In other words, learning is hindered when
cognitive overload occurs and working memory capacity is
exceeded (De Jong, 2010). DeLeeuw & Mayer (2008) theorize that
there are three types of cognitive processing (essential, extraneous,
and generative) and place them in the triarchic model of cognitive
load. Mayer (2009) made this model the organizing framework for
the cognitive theory of multimedia learning and stated that a major
goal of multimedia learning and instruction is to “manage essential
processing, reduce extraneous processing and foster generative
processing” (p. 57). The model is heavily based on Sweller’s
cognitive load theory (Chandler & Sweller, 1991; Sweller, 1988, 
1994). 

According to Sweller, Van Merrienboer, and Paas (1998), there are
three types of cognitive load: intrinsic, extraneous, and germane.
Intrinsic cognitive load occurs during the interaction between the 
nature of the material being learned and the expertise of the 
learner. The second type, extraneous cognitive load, is caused by 
factors that aren’t central to the material to be learned, such as 
presentation methods or activities that split attention between 
multiple sources of information, and these should be minimized as 
much as possible. The third type of cognitive load, germane 
cognitive load, enhances learning and results in task resources 
being devoted to schema acquisition and automation. Intrinsic 
cognitive load cannot be manipulated, but extraneous and germane
cognitive load can. 

In the triarchic model of cognitive load, essential processing 
(intrinsic load) relates to the essential material or information to be 
learned. Extraneous processing (extraneous load) does not serve the
instructional goal or purpose and reduces the chances that transfer
of learning will occur. Generative processing (germane cognitive
load) is aimed at making sense of the presented material. It is the
activity of organizing and integrating information in working
memory. 

De Jong (2010) has called into question whether there is truly a 
distinction between intrinsic (essential) and germane (generative)
cognitive load, writing that if “intrinsic load and germane load are
defined in terms of relatively similar learning processes, the
difference between the two seems to be very much a matter of
degree, and possibly non-existent” (p. 111). DeLeeuw and Mayer
(2008), however, did report finding that extraneous, essential, and 
generative processing appear to be able to be measured by different 
assessment instruments, suggesting that they are three distinct 
constructs. 

The Science of Instruction 

The previous sections describe what Mayer (2009) calls the 
science of learning, while this section explains what Mayer calls 
the science of instruction and defines as the “creation of
evidence-based principles for helping people learn” (2009, p. 29), or more
simply as the “scientific study of how to help people learn”
(Mayer, 2010a, p. 543). Mayer insists that research on multimedia
instruction must be theory-grounded and evidence-based.
Theory-grounded means that each principle, method and concept is derived
from a theory of multimedia learning. Evidence-based means that 
each principle, method and concept is supported by an empirical 
base of replicated findings from rigorous and appropriate research 
studies, which yields testable predictions. Mayer (2011a)
subsequently adds the science of assessment to the sciences of
learning and instruction to form what he calls the “Big Three”
(p. 2).

As part of his evidence-seeking efforts for the science of 
instruction, Mayer (2009) identifies the following twelve 
multimedia instructional principles which were developed from 
nearly 100 studies over the past two decades: 

• Coherence Principle - People learn better when extraneous
material is excluded rather than included. 

• Signaling Principle - People learn better when cues that 
highlight the organization of the essential material are 
added. 

• Redundancy Principle - People learn better from graphics
and narration than from graphics, narration, and printed 
text. 

• Spatial Contiguity Principle - People learn better when 
corresponding words and pictures are placed near each 
other rather than far from each other on the page or screen. 

• Temporal Contiguity Principle - People learn better when 
corresponding words and pictures are presented at the same 
time rather than in succession. 

• Segmenting Principle - People learn better when a
multimedia lesson is presented in user-paced segments 
rather than as a continuous unit. 

• Pre-training Principle - People learn more deeply from a 
multimedia message when they receive pre-training in the 
names and characteristics of key components. 

• Modality Principle - People learn better from graphics and 
narration than from graphics and printed text. 

• Multimedia Principle - People learn better from words and
pictures than from words alone. 

• Personalization Principle - People learn better from a 
multimedia presentation when the words are in 
conversational style rather than in formal style.

• Voice Principle - People learn better when the words in a 
multimedia message are spoken by a friendly human voice 
rather than a machine voice. 

• Image Principle - People do not necessarily learn more 
deeply from a multimedia presentation when the speaker’s 
image is on the screen rather than not on the screen. 

As mentioned earlier, these twelve principles are grouped in a
framework based on the three types of cognitive load (Mayer 
2009): 

• reducing extraneous processing - coherence, signaling, 
redundancy, spatial contiguity, temporal contiguity 

• managing essential processing - segmenting, pre-training, 
modality 

• fostering generative processing - multimedia, 
personalization, voice, image 
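
The grouping above can be written down as a small lookup table, which may be useful when auditing a lesson against the principles. The Python sketch below is illustrative only: the names PRINCIPLES_BY_GOAL and goal_for are hypothetical, and the table simply restates the grouping from Mayer (2009) as listed in this chapter.

```python
# Hypothetical lookup table restating the grouping of the twelve
# principles by processing goal, as listed in this chapter (Mayer, 2009).
PRINCIPLES_BY_GOAL = {
    "reduce extraneous processing": [
        "coherence", "signaling", "redundancy",
        "spatial contiguity", "temporal contiguity",
    ],
    "manage essential processing": [
        "segmenting", "pre-training", "modality",
    ],
    "foster generative processing": [
        "multimedia", "personalization", "voice", "image",
    ],
}


def goal_for(principle):
    """Return the processing goal a principle is grouped under, or None."""
    name = principle.strip().lower()
    for goal, principles in PRINCIPLES_BY_GOAL.items():
        if name in principles:
            return goal
    return None


if __name__ == "__main__":
    # Look up one principle and count how many are in the table.
    print(goal_for("Segmenting"))
    print(sum(len(p) for p in PRINCIPLES_BY_GOAL.values()))
```

Because the principles are subject to boundary conditions, a table like this is best read as a checklist of things to consider rather than a set of rules to enforce.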

In addition to these instructional principles, Mayer (2009) includes
boundary conditions that can determine the effectiveness of some
of the principles. These boundary conditions are a recent addition 
to the theory, and they suggest that the instructional principles in 
CTML are not universal, absolute rules. Some have criticized the 
existence of boundary conditions in CTML as an indicator that the 
theory has inconsistencies (De Jong, 2010), but Mayer (2010b) 
views boundary conditions as a healthy evolution in CTML that 
allows the theory to continue to develop and be implemented 
realistically, rather than as a set of immutable rules that have to be 
followed in all situations. 

One example of a boundary condition is that of individual
differences, which states that some instructional methods or
principles may be more effective for low-knowledge learners than
for high-knowledge learners (Mayer 2009; Schnotz and Bannert,
2003). Kalyuga, Ayres, Chandler & Sweller (2003) have called this
the expertise-reversal effect. Paas, Renkl, & Sweller (2004, pp. 2-3)
similarly stated this from a CLT point of view when they wrote:
“A cognitive load that is germane for a novice may be extraneous
for an expert. In other words, information that is relevant to the
process of schema construction for a beginning learner may hinder
this process for a more advanced learner.” Another example of a
boundary condition is the complexity and pacing condition, which 
suggests that some of these methods may be more effective when 
the material of the lesson is complex or the pace of the presentation 
is fast. Each principle in CTML is subject to boundary conditions 
as illustrated by Mayer (2009). 

Although they haven’t appeared in recent CTML literature, Mayer 
suggests several “advanced” principles for multimedia learning in 
his 2005 book, The Cambridge Handbook of Multimedia Learning, 
which are listed as chapters by various authors. These should be 
considered as possible areas for future CTML research and not 
necessarily evidence-based principles. 

• Animation and interactivity principles - People don’t
necessarily learn better from animation than from static 
diagrams. 

• Cognitive aging principle - Instructional design principles 
that effectively expand the capacity of working memory are 
particularly helpful for older learners. 

• Collaboration principle - People learn better when involved
in collaborative online learning activities. 

• Guided-discovery principle - People learn better when 
guidance is incorporated into discovery-based multimedia 
environments. 

• Navigation principles - People learn better in environments 
where appropriate navigational aids are provided. 

• Prior knowledge principle - Instructional principles that are 
effective in increasing multimedia learning for novices may 
have the opposite effect on more expert learners. 

• Self-explanation principle - People learn better when they 
are encouraged to generate self-explanations during 
learning. 

• Site map principle - People learn better in an online 
environment when presented with a map showing where 
they are in a lesson. 

• Worked-out example principle - People learn better when
worked-out examples are given in initial skill learning. 

In addition to the twelve principles and the advanced principles 
listed in this chapter, Mayer (2011a) discusses several more 
principles that have appeared in CTML literature over the years. 
This demonstrates once again that the cognitive theory of 
multimedia learning is dynamic. Therefore, the twelve principles 
should not be taken as a rigid canon, but rather a starting point for 
discussion. Mayer (2011b), for example, only lists ten principles 
just two years after he published the twelve principles, having 
dropped the multimedia and image principles. In fact, this number 
seems to vary from publication to publication, so the focus should 
be on understanding what the latest research suggests about the 
effectiveness of the various instructional methods, rather than 
memorizing a codified set of twelve, or any other number of 
principles. 

Development of the Theory 

The evolution of CTML literature and research is evident in the 
body of work published by Mayer and his colleagues over the past 
twenty years (Mayer, 2005a). Mayer reminds us that even though 
the name has changed over the years, the underlying elements of 
the theory have not changed. In fact, the theory appears to have
matured as it enters its third decade of active research and is finally 
reaching a consistently recognizable state. 

See Moore, Burton, & Myers (2004) for an excellent overall 
accounting of the theoretical and research foundations of 
multimedia learning and Yuan et al. (2006) for the extensive 
history of working memory. The actual cognitive theory of 
multimedia learning first begins to emerge as a distinct theory at 
the end of the 1980s when Mayer (1989) introduced the theory as 
the “model of meaningful learning” and then shortly thereafter as 
the “cognitive conditions for effective illustrations” (Mayer & 
Gallini, 1990). It has also been called the “dual-coding model” 
(Mayer & Anderson, 1991, 1992), “generative theory” (Mayer, 
Steinhoff, Bower, & Mars, 1995), the “generative theory of 
multimedia learning” (Mayer, 1997; Plass, Chun, Mayer, &
Leutner, 1998), and the “dual-processing model of multimedia 
learning” (Mayer & Moreno, 1998). 

The name “cognitive theory of multimedia learning” was first used 
in Mayer, Bove, Bryman, Mars, and Tapangco (1996), but didn’t 
become the standard name for Mayer’s theory until the year 2000 
and beyond. The various models over the years focused on 
different aspects of the current model, but the underlying 
assumptions remained unchanged. Elements such as cognitive 
processes and mental representations were slowly added and 
refined until we have the model currently described by Mayer 
(2009). 

It is important to note that before her death, Roxana Moreno, a
former student of Mayer’s, had begun to develop a
cognitive-affective theory of learning with media (Moreno, 2005, 2006,
2007). Moreno (2005) included factors of self-regulation and
motivation in this theory and explained that this new model
extends the cognitive theory of multimedia learning by “integrating 
assumptions regarding the relationship between cognition, 
metacognition and motivation and affect” (2007, p. 767). Moreno 
& Mayer (2007) assert that the cognitive-affective theory of 
learning with media (CATLM) “expands the cognitive theory of 
multimedia learning to media such as virtual reality, agent-based, 
and case-based learning environments” (p. 313). 

Moreno’s model integrates three assumptions. The first assumption 
is that humans have a limited working memory capacity 
(Baddeley, 1992). The second assumption is that long-term
memory consists of past experiences and general domain
knowledge, which is similar to Tulving’s (1977) distinction
between episodic and semantic memory systems. The third
assumption is that motivational factors affect learning by
increasing or decreasing cognitive engagement (Pintrich,
2003). Paas (1992) discussed a similar distinction between mental
load and mental effort from a CLT perspective nearly two decades 
ago. 

Measurement and Instruments 

There is no one single measurement instrument that is associated 
with CTML research. Mayer (2009) states that since the goal is to 
make a causal claim about instructional effectiveness, one of
the most useful approaches in CTML research is quantitative 
experimental comparisons, with random assignment and 
experimental control being two important features. The main 
question in this type of research is whether a particular 
instructional method is effective. CTML researchers generally try 
to identify instructional methods that cause large effect sizes of 0.8
or greater across many different experimental comparisons. 
Learning is generally measured through tests of retention and 
transfer, and much of the recent research has focused on the 
instructional methods discussed earlier in this chapter. 
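
To make the effect-size threshold concrete, the sketch below computes a standardized mean difference between a treatment group and a control group. It assumes the effect size is reported as Cohen's d with a pooled standard deviation, which this chapter does not state explicitly. The function name cohens_d, the constant LARGE_EFFECT, and the test scores are invented for illustration and are not data from any study.

```python
from math import sqrt
from statistics import mean, stdev

# Threshold for a "large" effect as described in the text (0.8 or greater).
LARGE_EFFECT = 0.8


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled


if __name__ == "__main__":
    # Made-up transfer-test scores, for illustration only.
    words_and_pictures = [7, 8, 9, 10, 11]
    words_only = [5, 6, 7, 8, 9]
    d = cohens_d(words_and_pictures, words_only)
    print(round(d, 2), d >= LARGE_EFFECT)
```

In an actual CTML study the two groups would come from random assignment, and the comparison would typically be repeated across many experiments before a method is treated as effective.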

Because of its central role in CTML research, cognitive load 
theory research is also of interest. De Jong (2010) provides a 
lengthy criticism of the instruments and tests of measurement in 
cognitive load theory. He points out that one of the most frequently 
used methods for measuring cognitive load is self-reporting in a one-item
questionnaire where learners indicate their perceived amount of 
mental effort. De Jong asserts that this approach often leads to 
inconsistency in the outcomes of studies that use this type of 
questionnaire. Another way that cognitive load has been measured 
is physiologically using indicators such as heart rate, blood 
pressure, and pupillary reactions. A third way of measuring 
cognitive load has been through the dual-task or secondary-task 
approach which indicates increased consumption of cognitive 
resources in the primary task when slower or inaccurate
performance on the secondary task occurs (Brünken, Plass,
& Leutner, 2003). De Jong criticizes the measurement of cognitive 
load as a single construct, as most of these approaches tend to do. 
He calls for the development of better instruments and 
multidimensional scales that can reliably measure intrinsic, 
extraneous, and germane load separately.

Applying the Cognitive Theory of Multimedia Instruction 

Once we understand the science of learning and the science of
instruction, the next question becomes how to apply the principles
in order to foster meaningful learning. See Mayer’s (2011a)
Applying the Science of Learning for a good overview of what to
consider when applying the methods described in this chapter, as
well as others.

This section looks at what to keep in mind as the instructional
methods in CTML are implemented. In addition to applying the
twelve principles and the advanced principles presented in this
chapter and in Mayer (2005a, 2009, 2011b), the instructional
designer should be aware of the information presented in this
section when creating multimedia instruction. These theories come 
from the cognitive theory of multimedia learning, cognitive load 
theory, and cognitive science in general. It should be remembered 
that they are theories, and as such should be applied with caution, 
but all of them have research and a theoretical background that 
make them worth considering as guidelines for creating better 
instruction. 

The principles of multimedia learning should be viewed as 
instructional methods whose primary goal is to foster meaningful 
learning. An instructional method is a way of presenting a lesson; 
it does not change the content of the lesson—the covered content is 
the same. As discussed previously, the principles should not be 
viewed as absolute rules that have to be applied equally in every 
situation. They are guidelines that should be adjusted depending on 
the intended audience, the goals of the instruction, and boundary 
conditions such as the expertise level of the learner. Most 
important, the theory is a learner-centered learning theory (Mayer, 
2009). 

Learner-Centered Focus 

A critical perspective to maintain while designing multimedia 
lessons according to CTML is that the multimedia instructional 
methods are learner-centered—they are not technology-centered 
approaches. Mayer (2009) reminds us that multimedia can be as 
simple as a still image with words and that it is the instructional 
method, not the technology, that matters. Multimedia instructional
designers often fall victim to letting the technology drive the 
instructional design, rather than looking at the design from the 
perspective, and limitations, of the learner. 

Moreno (2006a) expressed this idea when she distinguished 
between a method-affects-learning hypothesis and a
media-affects-learning hypothesis. A media-affects-learning approach
could best be described as what occurred in the 20th century when
state-of-the-art technologies such as radio, television, computers, 
and the Internet were introduced into education with the 
assumption that they would improve education simply because 
they were better tools than had previously been available. 

Managing Cognitive Load 

Because the principles of CTML are organized around the three 
types of cognitive load, designing instruction according to 
cognitive load theory (CLT) research findings is important if you 
are designing according to CTML. Mayer, Fennell, Farmer, and 
Campbell (2004) cite evidence that two important ways to promote 
meaningful learning are to design activities that reduce cognitive 
load, which frees working memory capacity for deep cognitive 
processing during learning, and to increase the learner’s interest, 
which encourages the learner to use this freed capacity for deep 
processing during learning. CLT suggests that for instruction to be 
effective, care must be taken to design instruction in such a way as
not to overload the brain’s capacity for processing information.

CLT suggests that instructional techniques that require students to 
engage in activities that aren’t directed at schema acquisition and 
automation can quickly exceed the limited capacity of working 
memory and hinder learning objectives. In simple terms, this
means that you shouldn’t create unnecessary activities in
connection with a lesson that require excessive attention or
concentration that may overload working memory and prevent one
from acquiring the essential information that is to be learned. This
is an important guideline in any form of instruction, but it is an
essential rule in multimedia instruction because of the ease with 
which distractions can be incorporated. Instructional designers 
should not fill this limited capacity with unnecessary, flashy bells 
and whistles (Sorden, 2005). 

An example of what this means for multimedia instructional design 
is that the layout should be visually appealing and intuitive, but 
that the activities should remain focused on the concepts to be 
learned, rather than trying too much to entertain. This is especially 
true if the entertainment is time consuming to construct and 
complicated for the learner to master. Working memory can be 
overloaded by the entertainment or activity before the learner ever 
gets to the concept or skill to be learned. Mayer (2009) states that 
effective “instructional design depends on techniques for reducing 
extraneous processing, managing essential processing, and 
fostering generative processing” (p. 57).
Schnotz and Kurschner (2007) echo this idea by stating that
techniques to simply reduce cognitive load can be 
counterproductive. They argue that learning tasks should be 
adapted to the learner’s zone of proximal development which in 
turn depends on the learner’s level of expertise, and that intrinsic 
and germane cognitive load should be promoted while extraneous
cognitive load is reduced. De Jong (2010) states that the three main 
recommendations that cognitive load theory has contributed to the 
field of instructional design are: “present material that aligns with 
the prior knowledge of the learner (intrinsic load), avoid
non-essential and confusing information (extraneous load), and
stimulate processes that lead to conceptually rich and deep
knowledge (germane load)” (p. 111). These three types of cognitive
load occur simultaneously in working memory, which is limited in
capacity, so each can only increase at the expense of the other two. If
true, this creates important considerations for multimedia learning. 
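
The trade-off described above can be made concrete with a small
sketch. It is only an illustration under a simple additive
assumption (the three loads sum against one fixed capacity); the
names LoadEstimate, germane_headroom, and is_overloaded, and the
arbitrary units, are hypothetical and are not taken from Mayer, de
Jong, or any published measurement instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadEstimate:
    """Hypothetical load estimates, in arbitrary units (an assumption)."""
    intrinsic: float
    extraneous: float
    germane: float

    def total(self) -> float:
        # Additive assumption: the three loads draw on one shared capacity.
        return self.intrinsic + self.extraneous + self.germane


def germane_headroom(capacity: float, intrinsic: float, extraneous: float) -> float:
    """Capacity left for germane processing after the other two loads."""
    return max(0.0, capacity - intrinsic - extraneous)


def is_overloaded(load: LoadEstimate, capacity: float) -> bool:
    """True when the summed load exceeds the assumed capacity."""
    return load.total() > capacity


if __name__ == "__main__":
    capacity = 10.0
    # Cutting extraneous load from 3 to 1 frees two units for germane processing.
    print(germane_headroom(capacity, intrinsic=6.0, extraneous=3.0))
    print(germane_headroom(capacity, intrinsic=6.0, extraneous=1.0))
    print(is_overloaded(LoadEstimate(6.0, 3.0, 2.0), capacity))
```

Under this assumption, removing extraneous material does not by
itself improve learning; it only frees capacity that the design
must then direct toward germane processing.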

Task Analysis 

Task analysis is tied to the concepts of schemas and levels of 
expertise. The multimedia lesson should try to ensure that the 
learner has sufficiently automated key core knowledge or tasks.
The learner should do this before trying to tackle an overall task
that may be beyond the learner’s current ability range, which could 
cause unnecessary frustration and possibly even cause the learner 
to drop out of the activity. Vygotsky’s Zone of Proximal
Development and the related concept of scaffolding can be
applied here. This suggests that a task analysis should be done 
during the instructional design of a multimedia lesson in order to 
break down the skills and information that are needed to learn or
perform the educational objective. 

Guided Instruction 

According to CTML, guided instruction and worked examples are 
preferable to discovery learning, even though other learning 
theories often support discovery learning as a useful component of 
multimedia instruction. Mayer (2004; 2011a) and Kirschner, 
Sweller, & Clark (2006) caution against using discovery learning 
and argue that guided instruction is much more effective. Mayer 
(2011a) presents four principles for “studying by practicing” that 
support this idea. The four principles supporting guided instruction 
are spacing, feedback, worked example, and guided discovery. 

Interactivity 

While the principle of interactivity still requires more research, 
much of the literature suggests that infusing interactivity such as 
learner control, feedback, and guidance into a multimedia lesson
will increase the affective conditions that will improve learning
transfer and performance (Mayer, 2009; Piaget, 1969; Renkl &
Atkinson, 2007; Wittrock, 1990). Domagk, Schwartz, and Plass
(2010) define interactivity as “reciprocal activity between a learner
and a multimedia learning system, in which the [re]action of the
learner is dependent upon the [re]action of the system and vice
versa” (p. 1025). They propose a model of interactivity called the
Integrated Model of Multimedia Interactivity (INTERACT) which 
consists of six principal components of an integrated learning 
system: the learning environment, behavioral activities, cognitive 
and metacognitive activities, motivation and emotion, learner 
variables, and the learner’s mental model (learning outcomes). 
Moreno & Mayer (2007) also describe an interactive multimodal 
environment that is based on the cognitive affective theory of 
learning with media (CATLM) and include five design principles 
of guided activity, reflection, feedback, pacing, and pre-training. 

Animation and Screencasts 

Hasler, Kersten, & Sweller (2007) suggest that animation can be 
more effective when learners are allowed to stop and start the 
animation instead of having it just play through in one pass.
However, this still leaves the question of whether still images are
ultimately just as effective and much easier and cheaper to
produce. 

Regarding the use of animation to improve student achievement, 
Dwyer & Dwyer (2006) suggest that animation is not a viable 
instructional tool for improving achievement when the content to 
be learned is hierarchically structured. They go on to state that 
previous research does indicate that animation can be effectively 
used to teach both factual and conceptual types of information, but 
that this content can be taught equally well at less cost with other 
instructional strategies. Schnotz (2008) raises similar questions. 
This does not necessarily discount CTML studies, as CTML 
researchers have argued that simple graphical images can be highly 
effective when combined with words, and have already called into 
question whether animation is superior to still images in the 
“advanced” principles of animation and interactivity (Betrancourt, 
2005). 


Evaluation of the Theory 

Validation 

Theories are meant to be advanced upon and ultimately cast aside 
as new information is integrated and new understanding is 
developed. Moreno (2006a), for example, writes that “we should 
concede as cognitive scientists, that valid criticisms can be raised
against any existing theory of cognition and that such criticism is
essential to progress. Theories and constructs are useful only as 
long as they evolve in their heuristic, explanatory, and predictive 
functions” (p. 179). While the cognitive theory of multimedia 
learning has generally met with acceptance, there remain
questions from various learning and education theorists in certain
quarters about its validity, as well as the validity of other cognitive 
theories upon which it is based. Mayer and his colleagues, 
however, counter that there is an extensive body of research that 
does validate this theory. 

In recent years there have been several prominent researchers who 
have continued to develop the cognitive theories of multimedia 
learning and cognitive load. Among these are Richard E. Mayer, 
Roxana Moreno, John Sweller, Jan Plass, and Wolfgang Schnotz. 
Significant studies have included Mayer & Anderson (1991); 
Moreno & Mayer (2000); Schnotz & Bannert (2003); Paas, Renkl
& Sweller (2004); and Plass, Chun, Mayer, & Leutner (2004). Gall 
(2004) points out that much of Mayer’s research has been 
published in top peer-reviewed journals such as the Journal of 
Educational Psychology and is available for deeper study and 
critique. Dacosta (2008) provides a detailed table of almost 70 
published studies by CTML researchers on instructional principles, 
along with the number of experiments and the particular principle 
each study measured. For a substantial listing of dozens of CTML 
studies that support each of the twelve multimedia instructional 
principles presented in this chapter, see Mayer (2009). Finally, 
Yuan et al. (2006) also cite a series of studies that suggest that 
working memory performance correlates with cognitive abilities 
and academic achievement. 

Mayer (2009) states that his research goal is to contribute to the 
cognitive theory of multimedia learning, and ultimately to practical 
applied instructional practice. While criticizing the
technology-centered use of multimedia for instruction and the
misapplication of cognitive load theory, Mayer (2005b), Ballantyne
(2008), and
Schnotz (2008) have all stated that it is the instructional method 
that is important, not the technology, no matter how sophisticated. 
Ultimately, the validation of the theory lies in the fact that it has a 
large body of studies and literature to support it, that it has 
exhibited “staying power” and that it continues to demand 
attention and exert influence in the fields of education and training. 
The power of the theory lies in its dynamic structure, in which it is 
expected and even driven to constantly change and morph as new 
information is discovered and tested in the field of cognitive
science.

Critique

There are critics of the theory, however, and the use of multimedia 
for instruction has been challenged (Clark & Feldon, 2005; Tufte, 
2003). Ballantyne (2008) has criticized the narrowness of some of 
the CTML studies and whether the principles derived from CTML 
research can be applied in broader, more realistic settings. 
Additionally, striking at the very heart of the cognitive theory of 
multimedia learning, Rasch & Schnotz (2009) were not able to 
show that students actually learned better from text and pictures 
than from text alone, calling the multimedia principle itself into 
question. They also could not show that students learned better 
from interactive pictures than from non-interactive ones. 

Gall (2004) points out that Mayer’s research has tended to focus 
mainly on the understanding of physical and mechanical systems, 
and thus raises the question of how applicable his results are to 
nondidactic, immersive learning environments. This criticism of 
whether results obtained in controlled experimental situations can 
be applied to dynamic classrooms and learning environments is an 
old complaint that has been leveled at psychology since 
psychologists first began studying and trying to measure learning. 
Often, these charges of non-relevance to real-life learning and 
instruction have been justified. But Mayer is careful not to claim 
that his research should be seen as the final word on instruction in 
the situations he is trying to measure. Rather, it is obvious in the 
evolution of the cognitive theory of multimedia learning that its
researchers are only trying to determine what appears to make a
difference in
learning situations, hypothesize about it, and then continue to look 
for better explanations and hypotheses. The theory is dynamic and 
the expectation that it will continue to grow, adapt, and change 
appears frequently in the literature. 

There has also been general criticism of information-processing
theory and cognitive science on which CTML is based, and there 
have also been some negative critiques of certain aspects of 
cognitive load theory. Gerjets, Scheiter, & Cierniak (2008), for
example, state that according to the traditional critical rationalism 
of Popper, CLT cannot be considered a scientific theory because 
some of its fundamental assumptions cannot be tested empirically. 
However, Gerjets et al. go on to suggest that in spite of this 
limitation, CLT can still be viewed as a scientific theory under 
Sneed’s structuralist view of theories. De Jong (2010) asserts that 
many studies supporting CLT make speculative interpretations of 
what happened, but that only when a suitable measure of cognitive 
load is developed can these interpretations be considered valid.
Several cognitive and CTML researchers have also challenged
CLT as they attempt to evolve the theory and address perceived
shortcomings (Moreno, 2006; Schnotz and Kurschner, 2007;
Gerjets et al., 2009). De Jong (2010) provides an excellent critical
analysis of cognitive load theory that discusses many
problematic areas in the existing body of
research and literature.

De Jong (2010) points out that cognitive load theory is used to 
suggest guidelines for instructional design, which assumes that 
CLT research results are applicable to real-life situations, which de 
Jong questions. For example, de Jong cites several recent studies 
that could not find support for the modality principle when learner
control is increased. In response to de Jong’s article, Mayer
(2010b) states that criticism such as de Jong’s is welcome and
can only strengthen the cognitive theory of multimedia learning as 
weaknesses are exposed and researched in ways that contribute to 
the on-going evolution of the theory. 

Current Trends and Future Directions 

As has already been pointed out, the field of research for the 
cognitive theory of multimedia learning is very active; new studies 
and literature are being added every year. While Richard Mayer, 
Jan Plass, John Sweller, and the late Roxana Moreno continued to
publish research and books in support of their theories, many 
others have also contributed to the growing and maturing field. 
Dissertations, for example, are a way to gauge general trends and 
the overall vitality of the field. 

As a quick search in the ProQuest database will attest, there are 
dozens of dissertations that have been added in the last five years 
which have studied some aspect of Mayer’s cognitive theory of 
multimedia learning. The following are just a few examples of 
recent dissertations and their findings. Lu (2008) found that 
animated instructions with narration lead to better performance on
retention tests, possibly due to less cognitive load on the learner. 
Lu also found that levels of learner control may not benefit 
learners when learners do not have enough prior experience. Dong 
(2007) found that when positive emotions are elicited through an 
aesthetically-pleasing interface design, it can result in deeper 
learning, at least for low prior knowledge learners. Dacosta (2008) 
tentatively reported that his study was not able to reproduce the 
modality effect reported by Mayer & Moreno (Mayer, 1998, 
Experiments 1 and 2; Moreno & Mayer, 1999a, Experiments 1 and 
2; 2002a, Experiments 1 and 2; Moreno et al., 2001, Experiments
4a and 4b and 5a and 5b), stating that it did not appear that
middle-aged learners attained a higher degree of meaningful learning from
animation with concurrent narration than animation with 
concurrent printed text. Um (2008) reported that positive emotions 
induced before learning begins facilitate cognitive processes and
other affective experiences in a multimedia-based learning 
environment and that the positive mood state continued until the 
end of the learning and had significant effects on learning
performance, cognitive load, motivation, satisfaction and 
perception of the learning. Finally, Musallam (2010) found that 
students who received pre-training through a screencast prior to 
instruction reported a statistically significant decrease in perceived 
mental effort, as well as an increase in performance when 
compared to students who did not receive pre-training. 

Major trends at the moment appear to be continued research on the 
various CTML principles as well as identifying possible new ones. 
The focus on the affective domain in multimedia learning also 
seems to be strengthening as noted in the recent dissertations by 
students of CTML researcher Jan Plass (Dong, 2007; Um, 2008) 
and recent research by Moreno & Mayer (2007) on interactivity 
and the CATLM; and McLaren, DeLeeuw, & Mayer (2011) on 
whether learning is enhanced when web-based intelligent tutors 
use polite language instead of direct language in instruction. 
Another promising area for future research is that of interactivity, 
especially as intelligent tutoring system technology continues to 
improve (Domagk, Schwartz & Plass, 2010; McLaren, DeLeeuw 
& Mayer, 2011; Moreno & Mayer, 2007). 

Possibly a holy grail for CTML and CLT researchers in the near
future will be the quest to find evidence of three separate cognitive 
load processes (essential, extraneous, generative) and how to 
measure them reliably (Brunken, Plass, & Leutner, 2003; 
Antonenko, Paas, Grabner, and van Gog, 2010). A recent study by 
Antonenko & Niederhauser (2010) found that measuring overall 
cognitive load with self-reports may not be adequate. Instead, they 
suggest that cognitive load should be viewed as a dynamic process 
and assessed with EEG-based measures to provide a more 
complete picture for explaining the causes and effects of cognitive 
load. 

While not directly focusing on the cognitive theory of multimedia 
instruction, the use of electroencephalography (EEG), which records
the brain’s electrical activity, to measure increased cognitive load
as learners engage in tasks provides an interesting new development
in CTML and CLT research. Electroencephalography has been around for
many years (Gerlic & Jausovec, 1999; Klimesch, 1999), but it is
only recently that the technology has advanced to the point of
possibly being able to non-obtrusively and reliably measure what
appears to be cognitive load while learners engage in normal
learning activities. Jaeggi et al. (2007), for example, observed 
specific patterns of brain activity when cognitive load was thought 
to be high. This neuroscientific approach may also eventually be 
able to help with distinguishing between intrinsic, extraneous, and 
germane cognitive load, which has not been possible up to this 
point. Since CLT is a central concept of CTML, a breakthrough in 
CLT measurement technology as this study suggests could be very 
significant for the future of CTML research. 

Mayer (2011b) states that more research is needed, including “(a)
the continued discovery of evidence-based principles for 
multimedia design particularly in authentic learning situations; and 
(b) research that pinpoints the boundary conditions of multimedia 
design principles” (p. 441). Mayer notes that new multimedia 
technology is emerging faster than the science of instruction is 
developing, and that more research in these new areas is needed, 
especially as textbooks migrate to computer-based media, 
instructional multimedia games and simulations become more 
prevalent, and mobile learning on hand-held devices becomes 
common. He reiterates that it is the instructional method that 
matters, not the sophistication of the technology, and believes that
we do not need more comparison research between technologies or 
unscientific studies on the development of new multimedia 
technologies for instruction. 

As de Jong (2010) and Mayer (2010b) point out, there are many, 
many unresolved issues in CLT and CTML research, which means 
that the field continues to be wide open and should provide a 
challenging and stimulating area of research for many years to 
come. De Jong suggests that research in CLT should now turn to 
finding load-reducing approaches for intensive
knowledge-producing strategies such as learning from multiple
representations, self-explanations, inquiry learning, or game-based
learning, which all stimulate germane (generative) processes. While
speaking to cognitive load theory, de Jong’s recommendation for 
future research can be applied as easily to the cognitive theory of 
multimedia learning in determining: “(1) which instructional
treatments lead to which cognitive processes (and how), (2) what 
the corresponding effects are on memory workload and potential 
overload, (3) what characteristics of the learning material and the 
student mediate these effects and (4) how best to measure effects 
on working memory load in a theory-related manner” (p. 127).

Conclusion

The cognitive theory of multimedia learning has progressed over 
the past two decades and is poised to become a mature, robust 
theory as it enters its third decade. Fortunately, the theoretical 
cognitive foundations upon which the theory is based go much 
further back and have contributed heavily to its framework of the 
“big three” sciences, as well as the structure given to its principles 
by the triarchic theory of cognitive load. Together, these two areas 
of study form what we generally understand today to be the
cognitive theory of multimedia learning. 

The theory is expanding into exciting new areas that will allow it 
to continue to evolve. Its learner-centered and cognitive- 
constructivist orientation makes it very relevant in current 
educational applications. The fact that it focuses on finding 
effective instructional methods rather than a specific technology 
makes it a dynamic theory that will allow it to expand well beyond 
the life cycle of any particular technology. 

While the theory continues to have problematic and unanswered 
areas, the researchers acknowledge this and expect that the theory 
will continue to develop and change as new and better research 
techniques are developed for the study of how we learn and how 
the human brain works. It is an exciting field that is developing 
very quickly due to advances in technology and neuroscience, and 
there is a great need for new researchers to contribute new 
scientific studies to the development of the theory, the principles, 
the boundary conditions, and finally, the “big three” sciences of 
learning, instruction, and assessment themselves.